Fusion of Acoustic and Linguistic Speech Features for Emotion Detection

Authors

  • Florian Metze
  • Tim Polzehl
  • Michael Wagner
Abstract

This paper describes a system that uses acoustic and linguistic information from speech to decide whether an utterance carries negative or non-negative meaning. An earlier version of this system was submitted to the Interspeech 2009 Emotion Challenge evaluation. The speech data consist of short utterances of children's speech, and the proposed system is designed to detect anger in each given chunk. Various frame-based cepstral, prosodic, and acoustic features are extracted automatically and classified by a support vector machine. An automatic speech recognizer transcribes the utterances and yields a separate classification based on the degree of emotional salience of the words. The emotionally salient words are computed on word hypotheses, so untranscribed training data is sufficient. Late fusion is applied to make the final anger vs. non-anger decision for each utterance.
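
The two ideas named in the abstract, emotional salience of words and late fusion of two classifier outputs, can be illustrated with a minimal sketch. The abstract does not give the salience formula or the fusion rule, so everything below is an assumption: salience is computed with a commonly used mutual-information style definition, sal(w) = sum over classes e of P(e|w) * log(P(e|w) / P(e)), and fusion is a simple weighted average of two scores in [0, 1]. The function names, the smoothing constant, the fusion weight, and the threshold are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def emotional_salience(utterances, labels, smoothing=1.0):
    """Return {word: salience} for tokenized utterances and their class labels.

    Assumed definition (not taken from the paper):
        sal(w) = sum_e P(e | w) * log(P(e | w) / P(e))
    The utterances may be ASR word hypotheses rather than manual transcripts.
    """
    classes = sorted(set(labels))
    class_counts = Counter(labels)
    total = len(labels)

    # Count, per word, how many utterances of each class contain it.
    word_class = defaultdict(Counter)
    for words, label in zip(utterances, labels):
        for word in set(words):
            word_class[word][label] += 1

    salience = {}
    for word, counts in word_class.items():
        # Additive smoothing keeps every P(e | w) strictly positive.
        denom = sum(counts.values()) + smoothing * len(classes)
        score = 0.0
        for cls in classes:
            p_cls_given_word = (counts[cls] + smoothing) / denom
            p_cls = class_counts[cls] / total
            score += p_cls_given_word * math.log(p_cls_given_word / p_cls)
        salience[word] = score
    return salience


def late_fusion(acoustic_score, linguistic_score, weight=0.5, threshold=0.5):
    """Combine two per-utterance anger scores in [0, 1] by weighted averaging.

    Returns the fused score and the binary decision (True means anger).
    The weighted average is an illustrative choice of fusion rule.
    """
    fused = weight * acoustic_score + (1.0 - weight) * linguistic_score
    return fused, fused >= threshold


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    hyps = [["stop", "that"], ["good", "boy"], ["stop", "it"], ["good", "that"]]
    tags = ["anger", "non-anger", "anger", "non-anger"]
    print(emotional_salience(hyps, tags))
    print(late_fusion(0.8, 0.4))
```

In this sketch a word that occurs equally often in both classes receives a salience near zero, while a word seen only in angry utterances receives a positive value. In practice the fusion weight would be tuned on held-out data.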

Similar resources

Comparative Study on Feature Selection and Fusion Schemes for Emotion Recognition from Speech

The automatic analysis of speech to detect affective states may improve the way users interact with electronic devices. However, analysis at the acoustic level alone may not be enough to determine the emotion of a user in a realistic scenario. In this paper we analyzed the spontaneous speech recordings of the FAU Aibo Corpus at the acoustic and linguistic levels to extract two sets of fe...

A Study of the Relationship between Acoustic Features of “bæle” and the Paralinguistic Information

Language users rely on specific phonetic tools to communicate linguistic information as well as emotional and paralinguistic information in daily conversation. Because prosodic features help convey semantic information to listeners, they form an essential part of linguistic behaviour, and manipulating them can potentially play an important role in transmitt...

Speaker independent emotion recognition by early fusion of acoustic and linguistic features within ensembles

Herein we present a comparison of novel concepts for a robust fusion of prosodic and verbal cues in speech emotion recognition. Thereby 276 acoustic features are extracted out of a spoken phrase. For linguistic content analysis we use the Bag-of-Words text representation. This allows for integration of acoustic and linguistic features within one vector prior to a final classification. Extensive...

Improving Spontaneous Children's Emotion Recognition by Acoustic Feature Selection and Feature-Level Fusion of Acoustic and Linguistic Parameters

This paper presents an approach to improve emotion recognition from spontaneous speech. We used a wrapper method to reduce an acoustic set of features and feature-level fusion to merge them with a set of linguistic ones. The proposed system was evaluated with the FAU Aibo Corpus. We considered the same emotion set that was proposed in the Interspeech 2009 Emotion Challenge. The main contributio...

Improving of Feature Selection in Speech Emotion Recognition Based-on Hybrid Evolutionary Algorithms

One of the important issues in speech emotion recognition is selecting appropriate feature sets in order to improve the detection rate and classification accuracy. In previous studies, researchers tried to select appropriate features for classification using feature selection and feature space reduction methods, such as Fisher and PCA. In this research, a hybrid evolutionary algorit...

Publication date: 2009